Add stream parameter to check if download is required or not #4

Sujanadh · 2025-06-20T12:31:31Z

Description

Added a stream parameter to RawDataClientConfig to avoid unnecessarily downloading data extracts to disk.

spwoodcock · 2025-06-20T15:20:05Z

osm_data_client/client.py


-async def get_osm_data(geometry: Union[Dict[str, Any], str], **kwargs) -> RawDataResult:
+async def get_osm_data(
+    geometry: Union[Dict[str, Any], str], stream: Optional[bool] = None, **kwargs


Here I think adding the stream param is not necessary, as it is already captured by **kwargs.

To get it from **kwargs, we can do something like this:

    config = RawDataClientConfig.default()
    if (stream := kwargs.pop("stream", False)):
        config.stream = stream
    client = RawDataClient(config=config)
    return await client.get_osm_data(geometry, **kwargs)
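A minimal, self-contained sketch of the kwargs.pop pattern suggested above. StubConfig and StubClient are hypothetical stand-ins for RawDataClientConfig and RawDataClient (the real classes are not reproduced here), and the fileName keyword is only an illustrative extra argument.

```python
import asyncio
from dataclasses import dataclass
from typing import Any


@dataclass
class StubConfig:
    """Hypothetical stand-in for RawDataClientConfig; only `stream` is modelled."""

    stream: bool = False


class StubClient:
    """Hypothetical stand-in for RawDataClient."""

    def __init__(self, config: StubConfig) -> None:
        self.config = config

    async def get_osm_data(self, geometry: Any, **kwargs: Any) -> dict[str, Any]:
        # Echo what was received so the effect of kwargs.pop() is visible.
        return {"stream": self.config.stream, "kwargs": kwargs}


async def get_osm_data(geometry: Any, **kwargs: Any) -> dict[str, Any]:
    config = StubConfig()
    # pop() removes "stream" from kwargs, so it is not forwarded to the client call.
    if (stream := kwargs.pop("stream", False)):
        config.stream = stream
    client = StubClient(config=config)
    return await client.get_osm_data(geometry, **kwargs)


# "fileName" is an illustrative keyword, assumed only for this demo.
result = asyncio.run(get_osm_data("{}", stream=True, fileName="demo"))
default_result = asyncio.run(get_osm_data("{}"))
```

Because stream is popped before the remaining keyword arguments are forwarded, the public signature can stay as (geometry, **kwargs) while the flag still reaches the config.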

spwoodcock · 2025-06-20T15:37:59Z

I also replaced typing.[List,Dict,Union,Tuple] with the equivalent built-in type syntax that Python now supports.

spwoodcock · 2025-06-20T16:58:39Z

I refactored this to use RawDataOutputOptions, hope you don't mind @Sujanadh!

@emirfabio I would love it if you could review this to see if it makes sense, or if you prefer another approach.
There is currently also one test failing; I didn't get to dig into it.

spwoodcock · 2025-06-23T21:23:13Z

@Sujanadh I'm not sure if Emir is around to check this over, but what do you think?

Is it working for you?

Sujanadh · 2025-06-25T07:54:19Z

Looks good to me.
But the response looks like this:

{
  "metadata": {
    "task_id": "9aa3a4e0-7252-4ba6-89f9-8c730c2c2afd",
    "format_ext": "geojson",
    "timestamp": "",
    "size_bytes": 63219,
    "file_name": "fmtm_data_extract",
    "download_url": "https://s3.dualstack.us-east-1.amazonaws.com/production-raw-data-api/default/fmtm_data_extract_geojson_uid_9aa3a4e0-7252-4ba6-89f9-8c730c2c2afd.geojson",
    "is_zipped": false,
    "bbox": null
  },
  "path": null,
  "data": {
    "download_url": "https://s3.dualstack.us-east-1.amazonaws.com/production-raw-data-api/default/fmtm_data_extract_geojson_uid_9aa3a4e0-7252-4ba6-89f9-8c730c2c2afd.geojson",
    "file_name": "fmtm_data_extract",
    "process_time": "a second",
    "query_area": "0.04 Sq Km",
    "binded_file_size": "0.06 MB",
    "zip_file_size_bytes": 63219
  },
  "extracted": false,
  "original_path": null,
  "extracted_files": null
}

There are repeated keys and values in both metadata and data. We can keep metadata such as process time, file name, query area and so on in the metadata section itself. This needs refactoring.
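One possible shape of that refactor, sketched on a plain dict rather than on the library's result class. merge_into_metadata is a hypothetical helper, and the sample response is a trimmed version of the JSON above; this is not how the library currently behaves.

```python
from typing import Any


def merge_into_metadata(result: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `result` with the `data` section folded into `metadata`."""
    metadata = dict(result.get("metadata") or {})
    for key, value in (result.get("data") or {}).items():
        # Keep the existing metadata value when both sections define a key.
        metadata.setdefault(key, value)
    merged = {k: v for k, v in result.items() if k != "data"}
    merged["metadata"] = metadata
    return merged


response = {
    "metadata": {"file_name": "fmtm_data_extract", "size_bytes": 63219},
    "path": None,
    "data": {
        "file_name": "fmtm_data_extract",
        "process_time": "a second",
        "query_area": "0.04 Sq Km",
    },
}
merged = merge_into_metadata(response)
```

Using setdefault means the duplicated keys (file_name, download_url) are stored once, while the keys that only exist in data (process_time, query_area) move into metadata.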

spwoodcock · 2025-06-25T08:42:50Z

Agree, there is a bit of duplication!

For a future PR 😁

Sujanadh requested a review from spwoodcock June 20, 2025 12:31

Sujanadh self-assigned this Jun 20, 2025

github-actions bot added the bug Something isn't working label Jun 20, 2025

fix(config): add stream parameter to check if download is required or not

a0547a7

Sujanadh force-pushed the fix/allow-stream-data-url branch from de435d6 to a0547a7 Compare June 20, 2025 12:36

[pre-commit.ci] auto fixes from pre-commit.com hooks

e77476e

for more information, see https://pre-commit.ci

spwoodcock requested a review from emirfabio June 20, 2025 15:09

spwoodcock reviewed Jun 20, 2025

spwoodcock and others added 4 commits June 20, 2025 16:36

build: add pkg build config, plus pytest config

5c4b612

fix: get stream param from kwargs

02ee45c

refactor: replace typing imports with direct python types

a021be2

[pre-commit.ci] auto fixes from pre-commit.com hooks

0ff7dc6

for more information, see https://pre-commit.ci

github-actions bot added dependency docs labels Jun 20, 2025

spwoodcock added 3 commits June 20, 2025 17:52

refactor: add download_data to RawDataOutputOptions to allow stream

8672a39

docs: add info about using download_file=False

cea04cf

Merge branch 'fix/allow-stream-data-url' of https://github.com/hotosm/raw-data-api-py into fix/allow-stream-data-url

8033594

spwoodcock requested review from spwoodcock and removed request for spwoodcock June 20, 2025 16:57

[pre-commit.ci] auto fixes from pre-commit.com hooks

89038b0

for more information, see https://pre-commit.ci

spwoodcock merged commit 7a6643f into main Jun 25, 2025
2 checks passed

spwoodcock deleted the fix/allow-stream-data-url branch June 25, 2025 08:44
